14 research outputs found
Enhanced free space detection in multiple lanes based on single CNN with scene identification
Many systems for autonomous vehicles' navigation rely on lane detection.
Traditional algorithms usually estimate only the position of the lanes on the
road, but an autonomous control system may also need to know if a lane marking
can be crossed or not, and what portion of space inside the lane is free from
obstacles, to make safer control decisions. On the other hand, free space
detection algorithms only detect navigable areas, without information about
lanes. State-of-the-art algorithms use CNNs for both tasks, at a significant
cost in computing resources. We propose a novel approach that estimates the
free space inside each lane with a single CNN. Additionally, at the cost of
only a small amount of extra GPU RAM, we also infer the road type, which is
useful for path planning. To achieve this, we train a multi-task CNN and then
post-process the network's output to extract polygons that can be used
directly in navigation control. Finally, we provide a computationally
efficient, ROS-based implementation that runs in real time. Our code and
trained models are available online.
Comment: Will appear in the 2019 IEEE Intelligent Vehicles Symposium (IV 2019).
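To make the multi-task design described above concrete (a shared encoder feeding a dense free-space segmentation head plus a cheap global road-type head), here is a minimal PyTorch sketch. The layer sizes, class counts, and module names are invented for illustration and are not the authors' released architecture:

```python
# Minimal sketch of a multi-task CNN: shared encoder, dense free-space
# head, and a lightweight scene/road-type classification head.
# All dimensions and class counts are illustrative assumptions.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, n_seg_classes: int = 3, n_road_types: int = 4):
        super().__init__()
        # Shared convolutional encoder (stand-in for the real backbone).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Dense head: per-pixel free-space / lane classes.
        self.seg_head = nn.Sequential(
            nn.Conv2d(64, n_seg_classes, 1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        )
        # Global head: road type, adding only a small memory overhead.
        self.scene_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, n_road_types)
        )

    def forward(self, x):
        f = self.encoder(x)
        return self.seg_head(f), self.scene_head(f)

net = MultiTaskNet()
seg, scene = net(torch.randn(1, 3, 256, 512))
print(seg.shape, scene.shape)  # (1, 3, 256, 512), (1, 4)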
Model-based occlusion disentanglement for image-to-image translation
Image-to-image translation is affected by entanglement phenomena, which may
occur when the target data contain occlusions such as raindrops, dirt, etc.
Our unsupervised model-based learning disentangles the scene from the
occlusions, while benefiting from an adversarial pipeline to regress the
physical parameters of the occlusion model. Experiments demonstrate that our
method handles varying types of occlusions and generates highly realistic
translations, qualitatively and quantitatively outperforming the state of the
art on multiple datasets.
Comment: ECCV 2020.
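As a toy illustration of the model-based idea, the sketch below composites a generator's clean translation with a simple parametric occlusion layer whose opacity would be regressed through the adversarial pipeline. This alpha-blended mask is a deliberately simplified stand-in, not the paper's physical raindrop or dirt model:

```python
# Toy occlusion model: a fixed mask pattern with a learnable opacity.
# The disentangled "clean" translation is composited with the occluder,
# and the composite is what the discriminator would see.
import torch
import torch.nn as nn

class OcclusionModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Physical parameter regressed adversarially (illustrative).
        self.opacity = nn.Parameter(torch.tensor(0.5))

    def forward(self, clean, mask):
        alpha = torch.sigmoid(self.opacity) * mask  # per-pixel blend weight
        return (1 - alpha) * clean + alpha          # white occluder overlay

occ = OcclusionModel()
clean = torch.rand(1, 3, 64, 64)                    # disentangled scene
mask = (torch.rand(1, 1, 64, 64) > 0.9).float()     # occlusion footprint
fake_occluded = occ(clean, mask)                    # fed to the discriminator
```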
ManiFest: Manifold Deformation for Few-shot Image Translation
Most image-to-image translation methods require a large number of training
images, which restricts their applicability. We instead propose ManiFest: a
framework for few-shot image translation that learns a context-aware
representation of a target domain from a few images only. To enforce feature
consistency, our framework learns a style manifold between source and proxy
anchor domains (assumed to be composed of large numbers of images). The learned
manifold is interpolated and deformed towards the few-shot target domain via
patch-based adversarial and feature statistics alignment losses. All of these
components are trained simultaneously during a single end-to-end loop. In
addition to the general few-shot translation task, our approach can
alternatively be conditioned on a single exemplar image to reproduce its
specific style. Extensive experiments demonstrate the efficacy of ManiFest on
multiple tasks, outperforming the state-of-the-art on all metrics and in both
the general- and exemplar-based scenarios. Our code is available at
https://github.com/cv-rits/Manifest .
Comment: ECCV 2022.
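The core mechanism stated in the abstract, interpolating a style on a manifold between two large anchor domains and deforming it towards the few-shot target, could be sketched as follows. The module, its dimensions, and the parameterization are illustrative assumptions, not ManiFest's actual implementation:

```python
# Sketch: a style code interpolated between two anchor-domain styles,
# then deformed towards the few-shot target. In the paper this would be
# trained with patch-based adversarial and feature-statistics losses.
import torch
import torch.nn as nn

class StyleManifold(nn.Module):
    def __init__(self, style_dim: int = 64):
        super().__init__()
        self.anchor_a = nn.Parameter(torch.randn(style_dim))  # proxy domain A
        self.anchor_b = nn.Parameter(torch.randn(style_dim))  # proxy domain B
        self.alpha = nn.Parameter(torch.zeros(1))             # interpolation coeff
        self.delta = nn.Parameter(torch.zeros(style_dim))     # deformation to target

    def forward(self):
        w = torch.sigmoid(self.alpha)                  # keep weight in [0, 1]
        base = w * self.anchor_a + (1 - w) * self.anchor_b
        return base + self.delta                       # deformed target style

style = StyleManifold()()  # style code injected into the translation decoder
```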
Exploring domain-informed and physics-guided learning in image-to-image translation
Image-to-image (i2i) translation networks can generate fake images beneficial for many applications in augmented reality, computer graphics, and robotics. However, they require large-scale datasets and a high level of contextual understanding to be trained correctly. In this thesis, we propose strategies for solving these problems, improving the performance of i2i translation networks by using domain- or physics-related priors. The thesis is divided into two parts. In Part I, we exploit human abstraction capabilities to identify existing relationships in images, thus defining domains that can be leveraged to improve data usage efficiency. We use additional domain-related information to train networks on web-crawled data, hallucinate scenarios unseen during training, and perform few-shot learning. In Part II, we instead rely on physics priors. First, we combine realistic physics-based rendering with generative networks to boost the realism and controllability of the outputs. Then, we exploit naive physical guidance to drive a manifold reorganization, which allows generating continuous conditions such as timelapses.
Physics-informed Guided Disentanglement in Generative Networks
Image-to-image translation (i2i) networks suffer from entanglement effects in the presence of physics-related phenomena in the target domain (such as occlusions, fog, etc.), which lower translation quality and variability. In this paper, we present a comprehensive method for disentangling physics-based traits in the translation, guiding the learning process with neural or physical models. For the latter, we integrate adversarial estimation and genetic algorithms to correctly achieve disentanglement. The results show that our approach dramatically increases performance in many challenging image translation scenarios.
Comment: Journal submission.
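As a hedged illustration of the genetic-algorithm component, the sketch below evolves a population of physical-parameter vectors against a fitness function standing in for the adversarial critic. The `critic_score` placeholder, population sizes, and mutation scheme are all invented for illustration:

```python
# Sketch: evolving physical-model parameters so that renderings score
# well under an adversarial critic. `critic_score` is a placeholder for
# the discriminator's realism score on images rendered with the params.
import numpy as np

rng = np.random.default_rng(0)

def critic_score(params: np.ndarray) -> float:
    # Toy fitness peaking at 0.3 per parameter (placeholder only).
    return -np.sum((params - 0.3) ** 2)

pop = rng.uniform(0, 1, size=(32, 4))            # 32 candidates, 4 parameters
for gen in range(50):
    fitness = np.array([critic_score(p) for p in pop])
    parents = pop[np.argsort(fitness)[-8:]]       # keep the 8 fittest
    children = parents[rng.integers(0, 8, 24)]    # resample parents
    children += rng.normal(0, 0.05, children.shape)  # Gaussian mutation
    pop = np.vstack([parents, children])

best = pop[np.argmax([critic_score(p) for p in pop])]
```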
CoMoGAN: continuous model-guided image-to-image translation
CoMoGAN is a continuous GAN relying on the unsupervised reorganization of the target data on a functional manifold. To that end, we introduce a new Functional Instance Normalization layer and residual mechanism, which together disentangle image content from position on the target manifold. We rely on naive physics-inspired models to guide the training while allowing private model/translation features. CoMoGAN can be used with any GAN backbone and enables new types of image translation, such as cyclic translation (e.g., timelapse generation) or detached linear translation. It outperforms the literature on all datasets. Our code is available at http://github.com/cv-rits/CoMoGAN
Comment: CVPR 2021 oral.
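A minimal sketch of what a Functional Instance Normalization layer could look like, assuming the affine parameters are simple learned functions of a scalar manifold position phi (e.g., time of day). The two linear maps are an illustrative parameterization, not CoMoGAN's exact layer:

```python
# Sketch: instance normalization whose scale/shift are functions of a
# scalar manifold position phi, so content comes from the features and
# "style" comes only from the position on the target manifold.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FunctionalInstanceNorm(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.gamma = nn.Linear(1, channels)  # scale as a function of phi
        self.beta = nn.Linear(1, channels)   # shift as a function of phi

    def forward(self, x, phi):
        x = F.instance_norm(x)                       # content normalization
        phi = phi.view(-1, 1)                        # one scalar per sample
        g = self.gamma(phi).unsqueeze(-1).unsqueeze(-1)
        b = self.beta(phi).unsqueeze(-1).unsqueeze(-1)
        return g * x + b                             # position-dependent style

fin = FunctionalInstanceNorm(64)
out = fin(torch.randn(2, 64, 32, 32), torch.tensor([0.2, 0.8]))
```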